Modeling Sentences in the Latent Space

نویسندگان

  • Weiwei Guo
  • Mona T. Diab
چکیده

Sentence Similarity is the process of computing a similarity score between two sentences. Previous sentence similarity work finds that latent semantics approaches to the problem do not perform well due to insufficient information in single sentences. In this paper, we show that by carefully handling words that are not in the sentences (missing words), we can train a reliable latent variable model on sentences. In the process, we propose a new evaluation framework for sentence similarity: Concept Definition Retrieval. The new framework allows for large scale tuning and testing of Sentence Similarity models. Experiments on the new task and previous data sets show significant improvement of our model over baselines and other traditional latent variable models. Our results indicate comparable and even better performance than current state of the art systems addressing the problem of sentence similarity.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Generating Sentences from a Continuous Space

The standard recurrent neural network language model (rnnlm) generates sentences one word at a time and does not work from an explicit global sentence representation. In this work, we introduce and study an rnn-based variational autoencoder generative model that incorporates distributed latent representations of entire sentences. This factorization allows it to explicitly model holistic propert...

متن کامل

Salience Estimation via Variational Auto-Encoders for Multi-Document Summarization

We propose a new unsupervised sentence salience framework for Multi-Document Summarization (MDS), which can be divided into two components: latent semantic modeling and salience estimation. For latent semantic modeling, a neural generative model called Variational Auto-Encoders (VAEs) is employed to describe the observed sentences and the corresponding latent semantic representations. Neural va...

متن کامل

Using Multidimensional Scaling for Assessment Economic Development of Regions

Addressing socio-economic development issues are strategic and most important for any country. Multidimensional statistical analysis methods, including comprehensive index assessment, have been successfully used to address this challenge, but they donchr('39')t cover all aspects of development, leaving some gap in the development of multidimensional metrics. The purpose of the study is to const...

متن کامل

Linear Formulas in Continuous Logic

We prove that continuous sentences preserved by the ultramean construction (a generalization of the ultraproduct construction) are exactly those sentences which are approximated by linear sentences. Continuous sentences preserved by linear elementary equivalence are exactly those sentences which are approximated in the Riesz space generated by linear sentences. Also, characterizations for linea...

متن کامل

Generating Contradictory, Neutral, and Entailing Sentences

Learning distributed sentence representations remains an interesting problem in the field of Natural Language Processing (NLP). We want to learn a model that approximates the conditional latent space over the representations of a logical antecedent of the given statement. In our paper, we propose an approach to generating sentences, conditioned on an input sentence and a logical inference label...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012